Online-Academy
Look, Read, Understand, Apply

Data Mining And Data Warehousing

Linear Classification

Linear Classifier: A linear classifier uses a linear combination of an object's features to make a classification decision. Such classifiers work well for problems like document classification, and more generally for problems with many variables (features), reaching accuracy levels comparable to non-linear classifiers while taking less time to train and use.

For a two-class classification problem, one can visualize the operation of a linear classifier as splitting a high-dimensional input space with a hyperplane: all points on one side of the hyperplane are classified as "yes", while the others are classified as "no".
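The decision rule described above can be sketched in a few lines. This is a minimal illustration, not a trained model; the weights and bias below are assumptions chosen by hand so the example is concrete.

```python
# Decision rule of a linear classifier: a point x is "yes" if it lies
# on the positive side of the hyperplane w . x + b = 0, else "no".
def linear_classify(x, w, b):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "yes" if score > 0 else "no"

# A 2-D example: the line x1 + x2 - 1 = 0 splits the plane.
w, b = [1.0, 1.0], -1.0
print(linear_classify([2.0, 2.0], w, b))  # point above the line -> "yes"
print(linear_classify([0.0, 0.0], w, b))  # point below the line -> "no"
```

In practice the weights w and bias b are learned from labeled data; only the sign of the score matters for the class decision.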

Linear Regression: A linear regression model estimates the relationship between a dependent variable (response variable) and one or more independent variables (regressors, also called explanatory variables). A simple linear regression has exactly one explanatory variable; a multiple linear regression model has two or more explanatory variables.
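For the simple (one-variable) case, the least-squares fit has a closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept is chosen so the line passes through the means. A minimal sketch, with illustrative data points:

```python
# Simple linear regression via the closed-form least-squares solution.
def fit_simple_linear_regression(xs, ys):
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) \
            / sum((x - x_mean) ** 2 for x in xs)
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Points lying exactly on y = 2x + 1 recover slope 2 and intercept 1.
slope, intercept = fit_simple_linear_regression([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)  # -> 2.0 1.0
```

Multiple linear regression generalizes this to a vector of coefficients, usually solved with matrix methods rather than this scalar formula.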

Support Vector Machine: A Support Vector Machine (SVM) is a supervised machine learning algorithm generally used for classification, though it can also be used for regression. It is particularly well-suited for binary classification tasks, but it can be extended to multi-class classification as well.

An SVM aims to find the optimal hyperplane that best separates the data into different classes. The best hyperplane is the one with the largest margin between the two classes — that is, the maximum distance between the hyperplane and the closest data points from each class (called support vectors).

Hyperplane: A decision boundary that separates different classes. In 2D, it's a line; in 3D, it's a plane; in higher dimensions, it's a hyperplane.
Margin: The distance between the hyperplane and the nearest data points from each class.
Support Vectors: Data points that lie closest to the hyperplane. These determine the position and orientation of the hyperplane.
Kernel Trick: A technique used to transform data into a higher-dimensional space to make it linearly separable. Common kernels: linear, polynomial, radial basis function (RBF).

Imagine you're classifying emails as spam or not spam.

  • The SVM tries to find a line (or plane) that best separates spam from non-spam emails.
  • It chooses the line that maximizes the distance from the nearest spam and non-spam emails.
  • If spam and non-spam emails can't be separated linearly, SVM uses a kernel (like RBF) to map them into a higher-dimensional space where a linear separation is possible.
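The last bullet can be illustrated with a toy example: 1-D points that no single threshold can separate become linearly separable after an explicit feature map x → (x, x²), which is the idea the kernel trick exploits without computing the mapping directly. The data and threshold here are illustrative assumptions:

```python
# Explicit feature map into 2-D, standing in for what a kernel does implicitly.
def feature_map(x):
    return (x, x * x)

spam = [-2.0, -1.5, 1.5, 2.0]   # lies on both sides of the non-spam points
not_spam = [-0.5, 0.0, 0.5]

# No single threshold on x separates the classes, but after mapping,
# the horizontal line x2 = 1.0 in the new space does.
separable = all(feature_map(x)[1] > 1.0 for x in spam) and \
            all(feature_map(x)[1] < 1.0 for x in not_spam)
print(separable)  # -> True
```

Kernels such as RBF achieve the same effect in much higher-dimensional (even infinite-dimensional) spaces, while only ever computing inner products between pairs of points.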